---
output: 
  bookdown::pdf_document2:
    fig_caption: yes
    keep_tex: yes
    toc: no
    number_sections: no
    latex_engine: xelatex
    pandoc_args: --lua-filter=multibib.lua
title: |
  | An Incomplete Recipe: 
  | One-Dimensional Latent Variables Do Not Capture the Full Flavor of Democratic Support
keywords: "Democratization, public support, regime survival, measurement uncertainty, latent variables."
date: " "
editor_options: 
  markdown: 
    wrap: sentence
tables: true # enable longtable and booktabs
citation_package: natbib
citeproc: false
fontsize: 12pt
indent: true
linestretch: 1.5 # double spacing using linestretch 1.5
bibliography:
  text: dcpo-demsupport-data.bib
  app: dcpo-demsupport-data-app.bib
biblio-style: apsr
citecolor: black
linkcolor: black
endnote: no
header-includes:
      - \usepackage{array}
      - \usepackage{caption}
      - \usepackage{graphicx}
      - \usepackage{siunitx}
      - \usepackage{colortbl}
      - \usepackage{multirow}
      - \usepackage{hhline}
      - \usepackage{calc}
      - \usepackage{tabularx}
      - \usepackage{threeparttable}
      - \usepackage{wrapfig}
      - \usepackage{fullpage}
      - \usepackage{lscape} #\usepackage{lscape} better for printing, page displayed vertically, content in landscape mode, \usepackage{pdflscape} better for screen, page displayed horizontally, content in landscape mode
      - \newcommand{\blandscape}{\begin{landscape}}
      - \newcommand{\elandscape}{\end{landscape}}
      - \usepackage{titlesec}
      - \titleformat*{\section}{\normalsize\bfseries}
      - \titleformat*{\subsection}{\normalsize\itshape}
      - \usepackage{titling} #use \maketitle repeatedly
      - \usepackage{underscore}
---

\pagenumbering{gobble}

# Authors {.unnumbered}

-   Yue Hu, ORCID: <https://orcid.org/0000-0002-2829-3971>, Associate Professor, Department of Political Science, Tsinghua University, [yuehu\@tsinghua.edu.cn](mailto:yuehu@tsinghua.edu.cn){.email}
-   Yuehong Cassandra Tai, ORCID: <https://orcid.org/0000-0001-7303-7443>, Postdoctoral Fellow, Center for Social Data Analytics, Pennsylvania State University, [yhcasstai\@psu.edu](mailto:yhcasstai@psu.edu){.email}
-   Hyein Ko, ORCID: <https://orcid.org/0000-0002-9497-9656>, Postdoctoral Research Associate, Department of Political Science, University of Illinois Urbana-Champaign, [hyeink\@illinois.edu](mailto:hyeink@illinois.edu){.email}
-   Byung-Deuk Woo, ORCID: <https://orcid.org/0000-0001-6902-7576>, Assistant Professor, Department of Political Science and International Studies, Incheon National University, [byungdeukwoo\@inu.ac.kr](mailto:byungdeukwoo@inu.ac.kr){.email}
-   Frederick Solt, corresponding author, ORCID: <https://orcid.org/0000-0002-3154-6132>, Professor, Department of Political Science, University of Iowa, [frederick-solt\@uiowa.edu](mailto:frederick-solt@uiowa.edu){.email}

\pagebreak

```{=tex}
\renewcommand{\baselinestretch}{1}
\selectfont
\maketitle
\renewcommand{\baselinestretch}{1.5}
\selectfont
```

```{=tex}
\begin{abstract}
Prominent recent works have measured democratic support using a single latent variable that purports to span a single dimension from steadfast opposition to wholehearted support.
This ignores ample evidence that support for democracy is complex and multidimensional.
Here we provide a series of validation tests of the sort of cross-national time-series latent variable measures employed in recent research by reference to questions on support for liberal democracy and opposition to its erosion from multiwave surveys conducted around the world.
These tests show that, across countries and years, this latent variable is nearly orthogonal to measures of support for contestation and participation; civil liberties; institutional constraints on executive power; and prioritizing democracy over the economy, economic equality, or order.
We conclude that support for democracy in any robust sense is simply not well captured by a one-dimensional latent variable.
Such measures are powerful, but researchers must be mindful of their limitations.
\end{abstract}
```

_keywords_: democratic support, public opinion, liberal democracy, latent variables, measurement validation
\pagebreak

\pagenumbering{arabic}

```{r setup, include=FALSE}
options(tinytex.verbose = TRUE)

knitr::opts_chunk$set(
    echo = FALSE,
    message = FALSE,
    warning = FALSE,
    cache = TRUE,
    dpi = 600,
    fig.width=7,
    fig.height = 2.5,
    plot = function(x, options)  {
        hook_plot_tex(x, options)
    }
)

# If `DCPOtools` is not yet installed:
# remotes::install_github("fsolt/DCPOtools")

if (!require(pacman)) install.packages("pacman")
library(pacman)
# load all the packages you will use below 
p_load(
    DCPOtools,
    cmdstanr,
    tidyverse,
    here,
    countrycode,
    patchwork,
    ggthemes,
    rsdmx,
    osfr,
    kableExtra
) 

# define functions
validation_plot <- function(v_data_raw,
                            lab_x = .55, lab_y = 92,
                            theta_summary, theta_results) {
    
    # defaults per https://stackoverflow.com/a/49167744/2620381
    if ("theta_summary" %in% ls(envir = .GlobalEnv) && missing(theta_summary))
        theta_summary <- get("theta_summary", envir = .GlobalEnv)
    if ("theta_results" %in% ls(envir = .GlobalEnv) && missing(theta_results))
        theta_results <- get("theta_results", envir = .GlobalEnv)
    
    median_val <- Vectorize(function(x) median(1:x),
                            vectorize.args = "x")
    
     v_vars <- v_data_raw %>% 
        select(item0 = item,
               title,
               neg) %>% 
        unique() %>% 
        mutate(v_val = str_extract(item0, "\\d+") %>% 
                   as.numeric() %>% 
                   {median_val(.) + 1} %>%
                   floor(),
               title = factor(title, 
                              levels = v_data_raw %>%
                                  pull(title) %>%
                                  unique()))
    
    # v_val <- str_extract(v_vars, "\\d+") %>% 
    #   as.numeric() %>%
    #   median(x = 1:.) %>% 
    #   `+`(1) %>% 
    #   floor()
    
    # v_vals <- v_vars %>% 
    #   mutate(value = str_extract("\\d+") %>% 
    #            as.numeric(),
    #          v_var2 = {median_val(value) + 1} %>% floor())
    
    validation_summarized <- v_data_raw %>% 
        DCPOtools::format_dcpo(scale_q = v_vars$item0[[1]], # these arguments are required
                               scale_cp = 1) %>% # but they don't matter
        pluck("data") %>% 
        mutate(item0 = str_remove(item, " \\d or higher")) %>% 
        right_join(v_vars, by = "item0") %>%
        arrange(title) %>% 
        filter(str_detect(item, paste(v_val, "or higher"))) %>%
        mutate(iso2c = countrycode::countrycode(country,
                                                origin = "country.name",
                                                destination = "iso2c",
                                                warn = FALSE),
               prop = if_else(neg, 1-y_r/n_r, y_r/n_r),
               se = sqrt((prop*(1-prop))/n),
               prop_90 = prop + qnorm(.9)*se,
               prop_10 = prop - qnorm(.9)*se) %>%
        inner_join(theta_summary %>% select(-kk, -tt), by = c("country", "year"))
    
    # if ("title" %in% names(v_data_raw)) {
    #   v_title <- v_data_raw %>% 
    #     filter(item == v_var) %>% 
    #     pull(title) %>% 
    #     first()
    # } else {
    #   v_title <- v_var
    # } 
    
    validation_cor <- theta_results %>%
        inner_join(validation_summarized %>%
                       select(country, year, title, prop, se),
                   by = c("country", "year"),
                   relationship = "many-to-many") %>% 
        rowwise() %>% 
        mutate(sim = rnorm(1, mean = prop, sd = se)) %>% 
        ungroup() %>% 
        select(title, theta, sim, draw) %>% 
        nest(data = c(theta, sim)) %>% 
        mutate(r = lapply(data, function(df) cor(df) %>% pluck(2,1)) %>% 
                   unlist()) %>%
        select(-data) %>% 
        group_by(title) %>% 
        summarize(r = paste("R =", round(mean(r), 2)))
    
    
    validation_summarized %>%
        ggplot(aes(x = mean,
                   y = prop * 100)) +
        geom_segment(aes(x = q10, xend = q90,
                         y = prop * 100, yend = prop * 100),
                     na.rm = TRUE,
                     alpha = .2) +
        geom_segment(aes(x = mean, xend = mean,
                         y = prop_90 * 100, yend = prop_10 * 100),
                     na.rm = TRUE,
                     alpha = .2) +
        geom_smooth(method = 'lm', formula = 'y ~ x', se = FALSE) +
        facet_wrap(~ title, ncol = 4) +
        geom_label(data = validation_cor, aes(x = lab_x,
                                              y = lab_y,
                                              label = r),
                   size = 2)
}

set.seed(324)
```

```{r dcpo_input_raw, include=FALSE, cache.extra = tools::md5sum(here::here("data-raw", "surveys_demsupport.csv"))}
survey_file <- list.files(here("data-raw"),
                          full.names = TRUE) %>% 
    str_subset("surveys.*\\.csv")

surveys <- read_csv(survey_file,
                    col_types = "ccccccc")

# dcpo_input_raw <- DCPOtools::dcpo_setup(vars = surveys,
#                                         datapath = here("..",
#                                                         "data",
#                                                         "dcpo_surveys"),
#                                         file = here("data",
#                                                     "dcpo_input_raw.csv"))
```

```{r summary_stats, cache = TRUE, cache.extra = tools::md5sum(here::here("data", "dcpo_input_raw.csv"))}
dcpo_input_raw <- read_csv(here("data", "dcpo_input_raw.csv"),
                           col_types = "cdcddcd")

with_min_coverage <- function(x, min_cov) {
    if (!is.na(min_cov)) {
        country <- year <- years <- spanned <- coverage <- NULL
        
        x <- x %>%
            group_by(country) %>%
            mutate(years = length(unique(year)),
                   spanned = length(min(year):max(year)),
                   coverage = years/spanned) %>%
            filter(coverage >= min_cov) %>%
            select(-years, -spanned, -coverage) %>%
            ungroup()
    }
    return(x)
}

with_max_gap <- function(x, max_gap, edges = TRUE) {
    if (!is.na(max_gap)) {
        country <- yr_obs <- NULL
        
        c_yrs <- x %>% 
            group_by(country, year) %>% 
            summarize(year = first(year)) %>% 
            mutate(lead_span = ifelse(!is.na(lead(year)),
                                      lead(year) - year - 1,
                                      50),
                   lag_span = ifelse(!is.na(lag(year)),
                                     year - lag(year) - 1,
                                     50),
                   min_span = pmin(lead_span, lag_span),
                   max_span = pmax(lead_span, lag_span),
                   drop = min_span > max_gap & max_span == 50)
        
        x <- x %>% 
            left_join(c_yrs,
                      by = c("country", "year")) %>% 
            filter(!drop) %>% 
            select(-contains("span")) %>% 
            select(-drop)
    }
    return(x)
}

process_dcpo_input_raw <- function(dcpo_input_raw_df) {
    dcpo_input_raw_df %>% 
        with_min_yrs(3) %>% 
        with_min_cy(5) %>% 
        with_min_yrs(3) %>% # double-check after dropping <5 cy
        filter(year >= 1972 & n > 0) %>% 
        group_by(country) %>% 
        mutate(cc_rank = n()) %>% 
        ungroup() %>% 
        arrange(-cc_rank)
}

dcpo_input_raw1 <- dcpo_input_raw %>% 
    filter(!(
        str_detect(survey, "army_wvs") &
            # WVS obs identified as problematic by Claassen
            ((country == "Albania" & year == 1998) |
                 (country == "Indonesia" &
                      (year == 2001 | year == 2006)) |
                 (country == "Iran" & year == 2000) |
                 (country == "Pakistan" &
                      (year == 1997 | year == 2001)) | # 1996 in Claassen
                 (country == "Vietnam" & year == 2001)
            ) |
            (str_detect(item, "strong_wvs") &
                 ((country == "Egypt" & year == 2012) |
                      (country == "Iran" &
                           (year == 2000 | year == 2007)) | # 2005 in Claassen
                      (country == "India") |
                      (country == "Pakistan" &
                           (year == 1997 | year == 2001)) | # 1996 in Claassen
                      (country == "Kyrgyzstan" &
                           (year == 2003 | year == 2011)) |
                      (country == "Romania" &
                           (year == 1998 | year == 2005 | year == 2012)) |
                      (country == "Vietnam" & year == 2001)
                 )) |
            (survey == "pew2002" &
                 (country %in% c("Angola", "Bolivia", "Brazil", "China", "Egypt",
                                 "Guatemala", "Honduras", "India", "Indonesia", "Côte d'Ivoire",
                                 "Mali", "Pakistan", "Senegal", "Venezuela", "Vietnam"))
            ) |
            (survey == "pew2005" &
                 (country %in% c("China", "India", "Morocco", "Pakistan"))
            ) |
            (survey == "pew2007" &
                 (country %in% c("Bolivia", "Brazil", "China", "India", "Côte d'Ivoire", "Pakistan", "South Africa", "Venezuela"))
            ) |
            (
                country %in% c(
                    "Puerto Rico",
                    "Northern Ireland",
                    "SrpSka Republic",
                    "Hong Kong SAR China"
                )
            )
    )) %>% 
    process_dcpo_input_raw()

n_surveys <- surveys %>% 
    distinct(survey) %>% 
    nrow()

n_items <- dcpo_input_raw1 %>%
    distinct(item) %>% 
    nrow()

n_countries <- dcpo_input_raw1 %>%
    distinct(country) %>% 
    nrow()

n_cy <- dcpo_input_raw1 %>%
    distinct(country, year) %>% 
    nrow() %>% 
    scales::comma()

n_years <- as.integer(summary(dcpo_input_raw1$year)[6]-summary(dcpo_input_raw1$year)[1])

spanned_cy <- dcpo_input_raw1 %>% 
    group_by(country) %>% 
    summarize(years = max(year) - min(year) + 1) %>% 
    summarize(n = sum(years)) %>% 
    pull(n) %>% 
    scales::comma()

total_cy <- {n_countries * n_years} %>% 
    scales::comma()

year_range <- paste("from",
                    summary(dcpo_input_raw$year)[1],
                    "to",
                    summary(dcpo_input_raw$year)[6])

n_cyi <- dcpo_input_raw1 %>% 
    distinct(country, year, item) %>% 
    nrow() %>% 
    scales::comma()

back_to_numeric <- function(string_number) {
    string_number %>% 
        str_replace_all(",", "") %>% # strip every thousands separator, not just the first
        as.numeric()
}

covered_share_of_spanned <- {back_to_numeric(n_cy)/back_to_numeric(spanned_cy) * 100}
```

Recent threats to liberal democracy have manifested as the slow erosion of institutions and norms rather than the sudden and violent takeover of power, the acts of politicians and elected officials rather than admirals and generals.
Considering that the ostensible backing for these antidemocratic office holders is some substantial share of the electorate, understanding the breadth and depth of citizen support for democracy is as important as ever.
On one hand, the breadth of that support for democracy remains truly impressive: @Anderson2021 [pp. 971-972], for example, estimates that some 90\% of humanity agrees that democracy is the best form of government [see also, e.g., @KirschWelzel2019; @Wuttke2022].
But on the other, a great deal of evidence indicates that these expressions of support for democracy do not run very deep, that such claims to favor democracy in the abstract do not consistently represent either commitments to _liberal_ democracy or opposition to actions that undermine it [see, e.g., @Bratton2002; @Schedler2007; @Carlin2011; @KiewietdeJonge2016; @Bryan2023].
As a systematic review of this literature emphasizes, "it is important to know not just how strongly citizens support democracy, but also _what kind of democracy it is that they support_" [@Konig2022].

These cautions notwithstanding, a number of prominent recent works have sought to better investigate democratic support by taking advantage of new latent-variable models of public opinion to estimate support for democracy across many countries and over long spans of time.
In this approach, responses to many different survey questions tapping support for democracy in the abstract are combined to overcome the sparse and scattered collection of data available on any single question; the differences among the various questions used as indicators are accounted for with careful modeling [see, e.g., @Claassen2019; @Solt2020c].
The resulting measure is then uncritically employed to explore the trends, determinants, and consequences of democratic support [@Claassen2020a; @Claassen2020b; @Claassen2022; @Tai2024; @Jacob2024].

Several narrow critiques of this line of research have already been raised [see @Tannenberg2022; @Hu2024], but we offer here a more fundamental criticism foreshadowed by the many works reviewed in @Konig2022: democratic support is multidimensional and therefore cannot be captured by any single measure.
In other words, although the latent variables employed in this research adequately summarize the survey questions on which they are based---which ask the extent of support for democracy in the abstract or for some similarly abstract non-democratic regime---these questions do not account for the complexity of public attitudes toward democracy.
As @Schedler2007 [, 637] noted nearly two decades ago, there exists a substantial number of people "who are sympathetic to democracy in the abstract, while hostile to core principles of liberal democracy in particular."
That piece listed four problems with questions about democracy in the abstract that could cause this pattern: the desire to give interviewers the socially acceptable answer, empty conceptions of democracy, competing conceptions of democracy, and other values that conflict with support for democracy [@Schedler2007, 638-640; see also, e.g., @Carlin2011; @KiewietdeJonge2016; @Bryan2023].
Regardless of the pattern's sources, we contend that compiling many questions about democracy in the abstract into a single one-dimensional latent variable should not be expected to change the fact that such questions do not accurately assess this attitudinal complexity.

We therefore test the validity of the sort of one-dimensional measure used in the recent line of latent-variable-driven research by comparing it to responses to questions drawn from many surveys used in the broader literature to map the dimensions of support for liberal democracy and opposition to its erosion.
The results of these validation tests are striking.
Across countries and years, the latent variable of democratic support is essentially orthogonal to these measures of support for contestation and participation, civil liberties, and institutional constraints on executive power, as well as of prioritizing democracy and political freedom over the economy, economic equality, and order.
Support for democracy in any sort of robust sense is simply not well captured by a single latent variable.

We draw two conclusions.
First, researchers who construct latent variables must be attentive to issues of multidimensionality and provide ample validation for their measures.
The still novel approach to estimating latent variables across countries and over time is a powerful tool, but it does not alone solve all the difficulties of measuring public attitudes.
Second, any research employing a single latent variable of democratic support---such as @Claassen2020a, which at the time of writing has already been cited nearly 300 times---should be viewed with profound skepticism.


# The One-Dimensional Latent Variable of Democratic Support {.unnumbered}

To provide the strongest test of the validity of the one-dimensional latent variable of democratic support, we create the best version of such a variable currently possible.
Following the practice of previous research employing these latent variables, we start by identifying survey questions that asked respondents to choose between democracy and an undemocratic alternative or to evaluate either democracy or one of these undemocratic alternatives; the most frequently fielded of these survey questions asks respondents whether they strongly disagree, disagree, agree, or strongly agree with the statement, "Democracy may have its problems but it is still the best form of government," the so-called Churchill item.
Our collection of these questions of support for democracy in the abstract is similar to but expands on the data presented in Claassen [-@Claassen2020a; -@Claassen2020b] and the larger set collected by @Tai2024.
In all, we identified `r n_items` survey items on support for democracy in the abstract that were asked in no fewer than five country-years in countries surveyed at least three times; these items were drawn from `r n_surveys` different survey datasets.
In accordance with the advice offered by @Hu2022 to avoid data-entry errors by automating data collection, we then used the `DCPOtools` R package [@Solt2019] to compile the responses to these questions.
Finally, we estimated a one-dimensional latent variable of democratic support from these responses using the population-level two-parameter ordinal-logistic item-response theory (IRT) model with country-specific item-bias terms presented in @Solt2020c; that work shows that this model provides a better fit to survey questions on support for democracy in the abstract than the one-parameter logistic IRT model principally employed in the existing research on this topic, Claassen's [-@Claassen2019] Model 5.
The resulting set of one-dimensional latent variable estimates comprises 2,937 country-year observations in 136 countries.
Full details on our collection of survey-question indicators of democratic support and on the resulting latent variable estimates can be found in Appendix A.
Appendix C presents the very similar results obtained for the tests below when using the estimates generated by Model 5 from the smaller dataset of indicator questions employed in Claassen [-@Claassen2020a; -@Claassen2020b] and @Tai2024.
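
The inclusion rule described above (items asked in no fewer than five country-years, in countries surveyed at least three times) can be illustrated with a minimal sketch on toy data.
The helper names `keep_countries()` and `keep_items()` and all values below are hypothetical stand-ins; the filters actually applied in our pipeline are those provided by `DCPOtools`, which may differ in detail.

```r
library(dplyr)

# Hypothetical toy input: one row per country-year-item observation
toy <- bind_rows(
    tibble(country = "A", year = 2001:2005, item = "churchill"),
    tibble(country = "B", year = c(2001L, 2003L, 2005L), item = "churchill"),
    tibble(country = "C", year = 2004L, item = "churchill"),    # surveyed only once
    tibble(country = "A", year = 2002:2003, item = "rare_item") # too few country-years
)

# Keep countries observed in at least `min_yrs` distinct years
keep_countries <- function(df, min_yrs = 3) {
    df %>%
        group_by(country) %>%
        filter(n_distinct(year) >= min_yrs) %>%
        ungroup()
}

# Keep items observed in at least `min_cy` distinct country-years
keep_items <- function(df, min_cy = 5) {
    df %>%
        group_by(item) %>%
        filter(n_distinct(paste(country, year)) >= min_cy) %>%
        ungroup()
}

filtered <- toy %>%
    keep_countries(3) %>%
    keep_items(5) %>%
    keep_countries(3) # re-check countries after dropping sparse items
```

The country filter is applied twice because dropping sparsely fielded items can leave a country with fewer than three observed years.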

```{r most_common}
# required chunk to run `dcpo_input` next

most_common_item <- dcpo_input_raw1 %>% 
    count(item) %>% 
    arrange(-n) %>% 
    slice_head() %>% 
    pull(item)

most_common_item_cy <- dcpo_input_raw1 %>% 
    filter(item == most_common_item) %>%
    distinct(country, year) %>%
    nrow()

most_common_item_surveys <- dcpo_input_raw1 %>%
    filter(item == most_common_item) %>%
    distinct(survey) %>%
    pull(survey) %>% 
    str_split(", ") %>% 
    unlist() %>% 
    unique() %>% 
    sort()

top_country_cyi <- dcpo_input_raw1 %>% 
    distinct(country, year, item) %>%
    count(country) %>%
    arrange(-n) %>% 
    slice_head() %>%
    pull(country)

top_country_cyi_obs <- dcpo_input_raw1 %>%
    filter(country == top_country_cyi) %>%
    distinct(country, year, item) %>%
    nrow()

top_country_cy <- dcpo_input_raw1 %>% 
    count(country, year) %>% 
    count(country) %>% 
    arrange(-n) %>% 
    slice_head() %>% 
    pull(country)

top_country_cy_obs <- dcpo_input_raw1 %>%
    filter(country == top_country_cy) %>%
    distinct(country, year) %>%
    nrow()

countries_cp <- dcpo_input_raw1 %>%
    mutate(country = if_else(stringr::str_detect(country, "United"),
                             stringr::str_replace(country, "((.).*) ((.).*)", "\\2.\\4."),
                             country),
           country = stringr::str_replace(country, "South", "S.")) %>% 
    distinct(country, year, item) %>%
    count(country) %>% 
    arrange(desc(n)) %>% 
    head(12) %>% 
    pull(country)

countries_cbyp <- dcpo_input_raw1 %>%
    mutate(country = if_else(stringr::str_detect(country, "United"),
                             stringr::str_replace(country, "((.).*) ((.).*)", "\\2.\\4."),
                             country),
           country = stringr::str_replace(country, "South", "S.")) %>% 
    distinct(country, year) %>%
    count(country) %>% 
    arrange(desc(n)) %>% 
    head(12) %>% 
    pull(country)

adding <- setdiff(countries_cbyp, countries_cp) %>% 
    knitr::combine_words()

dropping <- setdiff(countries_cp, countries_cbyp) %>% 
    knitr::combine_words()

y_peak_year <- dcpo_input_raw1 %>%
    distinct(country, year) %>%
    count(year, name = "nn") %>% 
    filter(nn == max(nn)) %>% 
    pull(year)

y_peak_nn <- dcpo_input_raw1 %>%
    distinct(country, year) %>%
    count(year, name = "nn") %>% 
    filter(nn == max(nn)) %>% 
    pull(nn)

data_poorest <- dcpo_input_raw1 %>%
    distinct(country, year) %>%
    count(country) %>%
    arrange(n) %>%
    filter(n == min(n)) %>%
    pull(country) %>% 
    knitr::combine_words() %>% 
    paste0("---", ., "---")

wordify_numeral <- function(x) setNames(c("one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"), 1:19)[x]

n_data_poor <- {data_poorest %>%
        str_split(",") %>% 
        first()} %>% 
    length() 

if (n_data_poor < 20) {
    n_data_poorest <- n_data_poor %>% 
        wordify_numeral()
} else {
    n_data_poorest <- n_data_poor
    data_poorest <- " "
}

```

```{r dcpo_input, eval=FALSE}
dcpo_input <- DCPOtools::format_dcpo(dcpo_input_raw1,
                                     scale_q = most_common_item,
                                     scale_cp = 2)
save(dcpo_input, file = here::here("data", "dcpo_input.rda"))
```

```{r dcpo, eval=FALSE}
iter <- 1000

dcpo <- cmdstan_model("~/Documents/Projects/DCPO/inst/stan/dcpo.stan")
dcpo_output <- dcpo$sample(
    data = dcpo_input[1:13], 
    max_treedepth = 14,
    adapt_delta = 0.99,
    step_size = 0.005,
    seed = 324, 
    chains = 4, 
    parallel_chains = 4,
    iter_warmup = iter/2,
    iter_sampling = iter/2,
    refresh = iter/50
)
results_path <- here::here(file.path("data", 
                                     iter, 
                                     {str_replace_all(Sys.time(), "[- :]", "") %>%
                                             str_replace("\\d{2}$", "")}))
dir.create(results_path, 
           showWarnings = FALSE, 
           recursive = TRUE)
dcpo_output$save_data_file(dir = results_path,
                           random = FALSE)
dcpo_output$save_output_files(dir = results_path,
                              random = FALSE)
```

```{r dcpo_results, cache=FALSE}
results_path <- here("data", "1000", "202301151607")

if (!exists("results_path")) {
    latest <- "202301151607"
    results_path <- here::here("data", "1000", latest)
    
    # Define OSF_PAT in .Renviron: https://docs.ropensci.org/osfr/articles/auth
    if (!file.exists(file.path(results_path, paste0("dcpo-", latest, "-1.csv")))) {
        dir.create(results_path, showWarnings = FALSE, recursive = TRUE)
        osf_retrieve_node("ksa4r") %>% 
            osf_ls_files(pattern = "csv") %>% 
            osf_download(path = results_path)
    }
    
    
    dcpo_output <- cmdstanr::as_cmdstan_fit(here::here(results_path,
                                                       list.files(results_path, pattern = "csv$")))
    
}
```

```{r dcpo_summary, eval=FALSE}
dcpo_output <- readRDS(here::here("data", "dcpo_output.rds"))
load(file = here::here("data", "dcpo_input.rda"))
theta_summary <- DCPOtools::summarize_dcpo_results(dcpo_input,
                                                   dcpo_output,
                                                   "theta")

save(theta_summary, file = here::here("data", "theta_summary.rda"))
```

```{r theta_results, eval=FALSE}
theta_results <- extract_dcpo_results(dcpo_input,
                                      dcpo_output,
                                      par = "theta")

saveRDS(theta_results, file = here::here("data", "theta_results.rds"))
```

```{r load_data}
load(here::here("data", "dcpo_input.rda"))
load(here::here("data", "theta_summary.rda"))
theta_results <- readRDS(here::here("data", "theta_results.rds"))

res_cy <- nrow(theta_summary) %>% 
    scales::comma()

res_c <- theta_summary %>% 
    pull(country) %>% 
    unique() %>% 
    length()
```


# Testing the Validity of Democratic Support as a One-dimensional Latent Variable {.unnumbered}

```{r v_data, include=FALSE}
if (!file.exists(here("data", "v_raw.rda"))) {
    v_titles <- here("data-raw", "validation_items.csv") %>%
        read_csv(col_types = "ccccccc") %>%
        select(item, title, neg, group, svy_grp) %>%
        distinct() %>% 
        mutate(title = paste0(svy_grp, ": ", title))
    
    v_raw <- here("data-raw", "validation_items.csv") %>%
        read_csv(col_types = "ccccccc") %>%
        DCPOtools::dcpo_setup(datapath = here("..",
                                              "data",
                                              "dcpo_surveys")) %>%
        left_join(v_titles, by = "item")
    
    save(v_raw, file = here("data", "v_raw.rda"))
} else {
    load(here("data", "v_raw.rda"))
}
```

```{r polyarchy, fig.cap="Correlations Between Democratic Support as a One-Dimensional Latent Variable and Polyarchy Survey Items \\label{polyarchy}", fig.height=3, cache=FALSE, fig.pos='h'}
v_vars_poly <- v_raw %>%
    filter(group == "contest" | group == "participation") %>% 
    arrange(group, title) 

validation_plot(v_vars_poly, lab_x = .55, lab_y = 15) +
    theme_bw() +
    theme(legend.position="none",
          axis.text  = element_text(size=8),
          axis.title = element_text(size=9),
          plot.title = element_text(hjust = 0.5, size = 9),
          strip.text.x = element_text(size=5),
          strip.background = element_blank()) +
    labs(x = "One-Dimensional Democratic Support",
         y = "% Agreeing")
```

To test the validity of the one-dimensional latent variable measure of democratic support, we compare it to a range of questions from multi-wave cross-national surveys that also address various aspects of support for democracy; that is, we provide a series of tests of convergent validation [see @Adcock2001, 540].
Because these survey questions were not used to estimate the latent variable---again, those indicators ask respondents to evaluate 'democracy' or a specific undemocratic alternative in the abstract---they constitute "external" validation tests [see @Caughey2019, 684-685].
We use questions that reference Dahl's [-@Dahl1971, 3-4] two "theoretical dimensions of democratization," public contestation and inclusive participation, and the institutionally guaranteed civil rights and liberties they comprise.
These criteria of liberal democracy not only comprise "the central construct in the literature on citizen preference for democracy" [@Konig2022, 2026] but are also directly implicated in the recent apprehensions regarding democratic backsliding [e.g., @Gora2022; @Ananda2023; @Meyerrose2023]. 
In light of recent concerns that citizens do not always prioritize the preferences for democracy they express in the abstract, we also draw on questions that assess democracy's relative importance.

We start with survey questions that most directly address public contestation and inclusive participation.
For example, the AmericasBarometer asks whether chief executives should "limit the voice and vote of opposition parties," the Asian Barometer questions whether "people with little or no education should have as much say in politics as highly-educated people," and at the two dimensions' point of greatest overlap, the Afrobarometer asks whether "we should choose our leaders in this country through regular, open and honest elections."
Scatter plots of the aggregate responses to these questions and other similar items against the one-dimensional latent variable of support for democracy in the abstract are presented in Figure\nobreakspace{}\@ref(fig:polyarchy).
It is evident at a glance that all of these relationships are extremely weak, essentially null.
Support for democracy in the abstract as measured by a one-dimensional latent variable has very little to do with attitudes toward these core features of liberal democracy.

```{r liberties, cache = FALSE, fig.cap="Correlations Between Democratic Support as a One-Dimensional Latent Variable and Civil Liberties Survey Items \\label{liberties}", fig.height=4, fig.pos='h'}

v_vars_lib <- v_raw %>%
    filter(str_detect(group, "free")) %>% 
    arrange(desc(group), title) 

validation_plot(v_vars_lib, lab_x = .55) +
    theme_bw() +
    theme(legend.position="none",
          axis.text  = element_text(size=8),
          axis.title = element_text(size=9),
          plot.title = element_text(hjust = 0.5, size = 11),
          strip.text.x = element_text(size=6),
          strip.background = element_blank()) +
    labs(x = "One-Dimensional Democratic Support",
         y = "% Agreeing")
```

Next, we consider the survey questions that examine the level of support for civil liberties.
We compiled four items regarding free speech, three on freedom of the press, three more on freedom of assembly, and two that deal with freedom of association.
Scatter plots of the aggregate responses to these questions against the one-dimensional latent variable of democratic support are presented in Figure\nobreakspace{}\@ref(fig:liberties).
All of these correlations are weak to very weak, and several run opposite to the expected direction.
For example, across five waves of the Asian Barometer, the one-dimensional latent variable of democratic support has a _positive_ relationship ($R~=~0.14$) with aggregate agreement with the statement, "The government should decide whether certain ideas should be allowed to be discussed in society."
Support for democracy in the abstract appears not to conflict with favoring government censorship in those countries and years. 
The strongest relationship with democratic support so measured is found with a free speech item from the AmericasBarometer, which asked respondents the extent of their approval of "people who only say bad things about" the country's form of government "appearing on television to make speeches."
That correlation reaches only $R~=~-0.39$, and even then, it stands out as an exception rather than the rule.
The one-dimensional latent variable of democratic support does not capture much of the variation across countries and over time in public demand that civil liberties be respected.

```{r institutions, cache = FALSE, fig.cap="Correlations Between Democratic Support as a One-Dimensional Latent Variable and Democratic Institutions Survey Items \\label{institutions}", fig.height=3, fig.pos='h'}

v_vars_inst <- v_raw %>%
    filter(group == "rule_of_law" | group == "no_leg") %>% 
    arrange(desc(group), title) 

validation_plot(v_vars_inst,
                lab_x = .6) +
    theme_bw() +
    theme(legend.position="none",
          axis.text  = element_text(size=8),
          axis.title = element_text(size=9),
          plot.title = element_text(hjust = 0.5, size = 11),
          strip.text.x = element_text(size=7),
          strip.background = element_blank()) +
    labs(x = "One-Dimensional Democratic Support",
         y = "% Agreeing")
```

Figure\nobreakspace{}\@ref(fig:institutions) turns to the institutions that protect citizens' rights and liberties from government abuse, i.e., courts and the rule of law, and those that make government policies depend on citizen preferences, that is, legislatures.
The one-dimensional latent variable relates at best only weakly to aggregated rule of law items like the Asian Barometer question, "It is ok for the government to disregard the law in order to deal with the situation, when the country is facing a difficult situation."
Its correlations with items specifically on legislative checks, like the AmericasBarometer's "when the Congress hinders the work of our government, our presidents should govern without the Congress," are perhaps more consistent, but remain unimpressive.
Support for democracy in the abstract, even in the form of a latent variable, has little relationship with support for the checks and balances of horizontal accountability.

As mentioned above, recent research has documented that many who proclaim support for democracy in the abstract nevertheless have other preferences---such as those regarding policy or partisanship---that they may consider to be more important priorities [e.g., @Graham2020; @Simonovits2022; see also @Krishnarajan2022].
Survey questions on the public's prioritization of democracy are relatively rare, but we found seven relevant items.
Most of these items straightforwardly asked respondents to choose between democracy or political freedom, on the one hand, and maintaining order, reducing economic inequality, or spurring economic development, on the other, though the AmericasBarometer asked, "would a military coup be justified under the following circumstances: when social protest is high."
Figure\nobreakspace{}\@ref(fig:conditions) shows that the one-dimensional latent variable of democratic support bears almost no relationship at all to the aggregated responses to these questions.


```{r conditions, cache = FALSE, fig.cap="Correlations Between Democratic Support as a One-Dimensional Latent Variable and Prioritization Survey Items \\label{conditions}", fig.height=3, fig.pos='h'}

v_vars_cond <- v_raw %>%
    filter(str_detect(group, "pref_")) %>% 
    arrange(desc(group), title) 

validation_plot(v_vars_cond, lab_x = .6, lab_y = 10) +
    theme_bw() +
    theme(legend.position="none",
          axis.text  = element_text(size=8),
          axis.title = element_text(size=9),
          plot.title = element_text(hjust = 0.5, size = 11),
          strip.text.x = element_text(size=7),
          strip.background = element_blank()) +
    labs(x = "One-Dimensional Democratic Support",
         y = "% Agreeing")
```


# Conclusions {.unnumbered}

We draw two straightforward conclusions, the first methodological and the second substantive.
From a methodological standpoint, we note that advances in computer hardware and Bayesian software have made estimating latent variables a powerful tool for studying public opinion.
But latent variables of public opinion, no matter how sophisticated, are not guaranteed to be good measures of the concepts that are important to us.
That the latent variable of public opinion we examine here bears little relationship to survey items more directly tapping the concept this variable was meant to describe underscores that, as with any newly proposed measure, validation tests of such latent variables are crucially important [exemplars include @Caughey2019, 686-691; @Woo2023, 772-773].
Measures that fall short in---or are not even subjected to---validation tests cannot serve as solid foundations for research.

To better capture the multidimensionality of democratic support, two potential approaches can be considered.
The first is a confirmatory method, in which separate one-dimensional models are fitted to distinct subsets of items, as illustrated by @Caughey2018 in a study of the multidimensional nature of public ideology.
The second option is an exploratory approach, such as that employed by @Pan2018china, to reveal the underlying dimensions.
A more recent advancement in this field is presented in @Berwick2024, which utilizes an exploratory group-level Bayesian IRT model to map the dynamic multidimensionality of public opinion within and across country contexts.

Substantively, our findings reinforce the extensive literature on democratic support.
Support for democracy is a complex, multidimensional concept, and many who profess to support democracy in the abstract nevertheless also endorse a variety of illiberal and undemocratic actions.
Therefore, one-dimensional measures---including those used in Claassen [-@Claassen2020a; -@Claassen2020b], @Claassen2022, @Tai2024, and @Jacob2024---are inappropriate both as a general matter and in particular with regard to research on the erosion of liberal democracies: support for democracy in any robust sense is simply not well captured by a one-dimensional latent variable.
Future research on the relationship between public opinion and democratic backsliding will need to take the multidimensionality of democratic support into account.

Moreover, at the individual level, there has been a sustained effort to identify the determinants of democratic support by comparing societies with different socioeconomic and political conditions.^[@LuChu2022 provides a summary of these established efforts and introduces a new approach using global barometer surveys.
The latent-variable-driven approach has also recently been applied below the country level.
See @CaugheyWarshaw2015 for methodological details and @Caughey2018 and @CaugheyWarshaw2018a for demonstrations.]
These studies contribute to understanding the psychological and sociological mechanisms behind the formation of democratic support, though many still treat democratic support as a one-dimensional scale---whether framed as support/opposition or support for procedural/outcome aspects of democracy.
Given the country-level findings in this paper, it would be valuable for future research to explore whether similar issues arise with respondent-level indicators of democratic support.
Such within-country discoveries could offer new insights into the formation and evolution of democratic support and help explain the inconsistencies in existing studies. 


# References {.unnumbered}

::: {#refs-text}
:::
